# Multi-scenario speech transcription

Whisper Fa Tinyyy
MIT
Persian automatic speech recognition model fine-tuned from OpenAI Whisper-tiny on the common_voice_11_0 dataset
Speech Recognition Transformers Other
hackergeek98
55
2
Whisper Large V3 Turbo Es
MIT
Spanish speech recognition model fine-tuned from Whisper-large-v3-turbo, reaching a word error rate of 5.34% on the Common Voice 17.0 Spanish dataset
Speech Recognition Transformers Spanish
adriszmar
52
4
Whisper Large V3 Turkish Test1
Apache-2.0
A Turkish speech recognition model fine-tuned from OpenAI Whisper-large-v3 on the Common Voice 17.0 Turkish dataset
Speech Recognition Transformers Other
erdiyalcin
21
3
Whisper Small Sinhala Fine Tune
Apache-2.0
A Sinhala speech recognition model fine-tuned from OpenAI Whisper-small
Speech Recognition Transformers
Subhaka
78
6
Whisper Medium Turkish 2
Apache-2.0
Turkish speech recognition model fine-tuned from OpenAI Whisper Medium on the Common Voice 11.0 dataset
Speech Recognition Transformers Other
emre
267
15
Exp W2v2t Fa Hubert S801
Apache-2.0
A Persian automatic speech recognition model fine-tuned from facebook/hubert-large-ll60k, trained using the Common Voice 7.0 Persian dataset.
Speech Recognition Transformers Other
jonatasgrosman
16
0
Exp W2v2t Sv Se Vp Nl S842
Apache-2.0
A Swedish automatic speech recognition model fine-tuned from facebook/wav2vec2-large-nl-voxpopuli on the Common Voice 7.0 (sv-SE) dataset.
Speech Recognition Transformers
jonatasgrosman
16
0
Wav2vec2 Large Xls R 300m Turkish Colab
Apache-2.0
This model is a Turkish speech recognition model fine-tuned on the Common Voice dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers
pinot
16
0
W2v Xls R Uk
Apache-2.0
Ukrainian automatic speech recognition model based on facebook/wav2vec2-xls-r-300m, trained on the Common Voice 10.0 dataset
Speech Recognition Transformers Other
Yehor
231.46k
8
Wav2vec2 Large Xls R 300m Turkish Colab
Apache-2.0
This model is a Turkish speech recognition model fine-tuned on the common_voice dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers
bansals10
23
0
Wav2vec2 Large Xls R 300m Ur
Apache-2.0
Urdu speech recognition model based on the wav2vec2-large-xls-r-300m architecture, fine-tuned on the Common Voice dataset
Speech Recognition Transformers
anuragshas
20
0
Wav2vec2 Large Xls R 300m Urdu
Apache-2.0
A speech recognition model fine-tuned on the Common Voice 8 Urdu dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers Other
kingabzpro
91.36k
13
Wav2vec2 Common Voice Tr Demo
Apache-2.0
A Swedish automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE SV-SE dataset.
Speech Recognition Transformers
birgermoell
17
0
Wav2vec2 Base 10k Voxpopuli Ft En
A Wav2Vec2 base model pre-trained on a 10K unlabeled subset of the VoxPopuli corpus and fine-tuned on English transcription data, suitable for English speech recognition tasks.
Speech Recognition Transformers English
facebook
40
1
Wav2vec2 Base 10k Voxpopuli Ft Ro
A speech recognition model based on Facebook's Wav2Vec2 architecture, fine-tuned for Romanian, suitable for automatic speech recognition tasks.
Speech Recognition Transformers Other
facebook
36
0
Wav2vec2 Base Sv Voxpopuli
A Wav2Vec2 base model pretrained on the Swedish subset of the VoxPopuli corpus, suitable for Swedish speech recognition tasks.
Speech Recognition Transformers Other
facebook
33
0
Wav2vec2 Large Xls R 300m Latvian
Apache-2.0
This is an automatic speech recognition model fine-tuned on Latvian datasets based on facebook/wav2vec2-xls-r-300m, achieving a WER of 16.98% on the Common Voice 7 test set.
Speech Recognition Transformers Other
infinitejoy
222
1
Xlsr Fa Lm
XLS-R-300m speech recognition model fine-tuned on Common Voice Persian data
Speech Recognition Transformers Other
manifoldix
16
1
Wav2vec2 Large Xlsr Greek 1
Apache-2.0
A Greek speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting 16 kHz audio input.
Speech Recognition Transformers Other
skylord
15
0
Wav2vec2 Base It Voxpopuli
Wav2Vec2 base model pretrained on unlabeled Italian data from VoxPopuli, suitable for speech recognition tasks.
Speech Recognition Transformers Other
facebook
32
0
Wav2vec2 Large Xls R 300m Turkish Colab
Apache-2.0
A speech recognition model fine-tuned on the Common Voice Turkish dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers
chaitanya97
23
0
Wav2vec2 Large Nl Voxpopuli
Automatic speech recognition model pre-trained on the Dutch subset of the VoxPopuli corpus
Speech Recognition Other
facebook
18
0
Wav2vec2 Xlsr Georgian
Apache-2.0
This model is an automatic speech recognition model fine-tuned on Georgian language datasets based on facebook/wav2vec2-xls-r-1b
Speech Recognition Transformers Other
sammy786
19
1
Wav2vec2 Xlsr Estonian
Apache-2.0
This is an automatic speech recognition model fine-tuned on Estonian datasets based on the facebook/wav2vec2-xls-r-1b model.
Speech Recognition Transformers Other
sammy786
21
1
Wav2vec2 Large Xls R 300m Spanish Custom
Apache-2.0
This is a speech recognition model fine-tuned on the Common Voice Spanish dataset based on the facebook/wav2vec2-xls-r-300m model, achieving a word error rate of 21.17% on the evaluation set.
Speech Recognition Transformers
tomascufaro
15
0
Wav2vec2 Large Xlsr 53 Portuguese
Apache-2.0
A large-scale Portuguese automatic speech recognition (ASR) model developed by Facebook based on the Wav2Vec 2.0 architecture, supporting Portuguese speech-to-text tasks.
Speech Recognition Other
facebook
425
6
Wav2vec2 Xls R 300m Uk
MIT
This is an automatic speech recognition (ASR) model fine-tuned on Ukrainian language datasets based on the facebook/wav2vec2-xls-r-300m model, achieving a 12.22% word error rate (WER) on the Common Voice Ukrainian test set.
Speech Recognition Transformers Other
robinhad
72
5
Wav2vec2 Large Xls R 300m Basque
Apache-2.0
An automatic speech recognition model fine-tuned on the Basque Common Voice dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers Other
deepdml
31
0
Wav2vec2 Base Turkish Cv8
This is an automatic speech recognition (ASR) model fine-tuned on the Common Voice 8.0 Turkish dataset, capable of converting Turkish speech into text.
Speech Recognition Transformers Other
cahya
16
1
Wav2vec2 Xls R 300m Cv8 Turkish
Apache-2.0
This is an automatic speech recognition (ASR) model fine-tuned on the Turkish Common Voice 8 dataset based on Facebook's wav2vec2-xls-r-300m model.
Speech Recognition Transformers Other
Baybars
16
0
Xls Npsc Oh
This model is an automatic speech recognition model fine-tuned on the NBAILAB/NPSC - 48K_MP3 dataset based on KBLab/wav2vec2-large-voxrex
Speech Recognition Transformers
NbAiLab
30
0
Xlsr 300m CV 8.0 50 EP New Params Nl
Apache-2.0
This is an automatic speech recognition (ASR) model based on the XLS-R architecture with 300M parameters, specifically optimized for Dutch and trained on the Common Voice 8.0 dataset.
Speech Recognition Transformers Other
Iskaj
25
0
Wav2vec2 Large Xls R 300m Bg V1
Apache-2.0
This is an automatic speech recognition (ASR) model fine-tuned on Bulgarian speech datasets based on the facebook/wav2vec2-xls-r-300m model.
Speech Recognition Transformers Other
DrishtiSharma
16
1
Wav2vec2 Large Xls R 300m Sl With LM V2
Apache-2.0
This is an automatic speech recognition (ASR) model fine-tuned on the Slovenian language (common_voice_8_0) dataset based on facebook/wav2vec2-xls-r-300m, supporting language model (LM) enhancement.
Speech Recognition Transformers Other
DrishtiSharma
26
0
Wav2vec2 Large Xls R 300m As V9
Apache-2.0
An automatic speech recognition model fine-tuned on the Assamese (Common Voice 8.0) dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers Other
DrishtiSharma
20
0
Wav2vec2 Common Voice Nl Demo
Apache-2.0
This is an automatic speech recognition (ASR) model fine-tuned on the Dutch COMMON_VOICE dataset based on the facebook/wav2vec2-large-xlsr-53 model.
Speech Recognition Transformers Other
MatsUy
16
0
Wav2vec2 Large Xls R 300m Pa IN Dx1
Apache-2.0
A Punjabi (India) automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers
DrishtiSharma
28
0
Wav2vec2 Large Xls R 300m Hsb V1
Apache-2.0
This is an automatic speech recognition model fine-tuned on the Upper Sorbian (HSB) dataset based on facebook/wav2vec2-xls-r-300m, achieving a word error rate (WER) of 0.4393 on the Common Voice 8 test set.
Speech Recognition Transformers Other
DrishtiSharma
20
0
Wav2vec2 Large Xlsr 53 Frisian
Apache-2.0
A Frisian automatic speech recognition (ASR) model fine-tuned from wav2vec2-large-xlsr-53, developed by RuudVelo.
Speech Recognition
RuudVelo
31
0